132 research outputs found

    Kizzle: A Signature Compiler for Detecting Exploit Kits

    Get PDF
    In recent years, the drive-by malware space has undergone significant consolidation. Today, the most common source of drive-by downloads are socalled exploit kits (EKs). This paper presents Kizzle, the first prevention technique specifically designed for finding exploit kits. Our analysis shows that while the JavaScript delivered by kits varies greatly, the unpacked code varies much less, due to the kits authors’ code reuse between versions. Ironically, this well-regarded software engineering practice allows us to build a scalable and precise detector that is able to quickly respond to superficial but frequent changes in EKs. Kizzle is able to generate anti-virus signatures for detecting EKs, which compare favorably to manually created ones. Kizzle is highly responsive and can generate new signatures within hours. Our experiments show that Kizzle produces high-accuracy signatures. When evaluated over a four-week period, false-positive rates for Kizzle are under 0.03%, while the false-negative rates are under 5%

    "What It Wants Me To Say": Bridging the Abstraction Gap Between End-User Programmers and Code-Generating Large Language Models

    Full text link
    Code-generating large language models translate natural language into code. However, only a small portion of the infinite space of naturalistic utterances is effective at guiding code generation. For non-expert end-user programmers, learning this is the challenge of abstraction matching. We examine this challenge in the specific context of data analysis in spreadsheets, in a system that maps the users natural language query to Python code using the Codex generator, executes the code, and shows the result. We propose grounded abstraction matching, which bridges the abstraction gap by translating the code back into a systematic and predictable naturalistic utterance. In a between-subjects, think-aloud study (n=24), we compare grounded abstraction matching to an ungrounded alternative based on previously established query framing principles. We find that the grounded approach improves end-users' understanding of the scope and capabilities of the code-generating model, and the kind of language needed to use it effectively

    User Interaction Models for Disambiguation in Programming by Example

    Get PDF
    Programming by Examples (PBE) has the potential to revolutionize end-user programming by enabling end users, most of whom are non-programmers, to create small scripts for automating repetitive tasks. However, examples, though often easy to provide, are an ambiguous specification of the user's intent. Because of that, a key impedance in adoption of PBE systems is the lack of user confidence in the correctness of the program that was synthesized by the system. We present two novel user interaction models that communicate actionable information to the user to help resolve ambiguity in the examples. One of these models allows the user to effectively navigate between the huge set of programs that are consistent with the examples provided by the user. The other model uses active learning to ask directed example-based questions to the user on the test input data over which the user intends to run the synthesized program. Our user studies show that each of these models significantly reduces the number of errors in the performed task without any difference in completion time. Moreover, both models are perceived as useful, and the proactive active-learning based model has a slightly higher preference regarding the users' confidence in the result

    Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses.

    Get PDF
    Breast cancer susceptibility variants frequently show heterogeneity in associations by tumor subtype1-3. To identify novel loci, we performed a genome-wide association study including 133,384 breast cancer cases and 113,789 controls, plus 18,908 BRCA1 mutation carriers (9,414 with breast cancer) of European ancestry, using both standard and novel methodologies that account for underlying tumor heterogeneity by estrogen receptor, progesterone receptor and human epidermal growth factor receptor 2 status and tumor grade. We identified 32 novel susceptibility loci (P < 5.0 × 10-8), 15 of which showed evidence for associations with at least one tumor feature (false discovery rate < 0.05). Five loci showed associations (P < 0.05) in opposite directions between luminal and non-luminal subtypes. In silico analyses showed that these five loci contained cell-specific enhancers that differed between normal luminal and basal mammary cells. The genetic correlations between five intrinsic-like subtypes ranged from 0.35 to 0.80. The proportion of genome-wide chip heritability explained by all known susceptibility loci was 54.2% for luminal A-like disease and 37.6% for triple-negative disease. The odds ratios of polygenic risk scores, which included 330 variants, for the highest 1% of quantiles compared with middle quantiles were 5.63 and 3.02 for luminal A-like and triple-negative disease, respectively. These findings provide an improved understanding of genetic predisposition to breast cancer subtypes and will inform the development of subtype-specific polygenic risk scores
    • …
    corecore